ABSTRACT 

Fungus-eater games yield a class of sequential decision 
tasks which involve both means objects and end objects in a 
not-too-unrealistic fashion. Study of human strategies and 
of the optimal strategies in these games may improve our 
understanding of complex dynamic decision tasks. 

The optimal strategy for the fourth fungus-eater game, 
which depends upon the level of fungus storage, is derived, 
and the behavior of humans playing this game is reported. 
OPTIMAL STRATEGIES AND HUMAN BEHAVIOR IN FUNGUS-EATER GAME 4 

Jun-ichi Nakahara and Masanao Toda 

In this paper we derive the optimal strategies for the finite, 
non-coexistent, V-span 1 game which we call game 4, G4. The reader is 
assumed to be familiar with the three preceding papers of this series 
(Toda, 1962; 1963a; and 1963b). To refresh the reader's memory, however, 
a brief summary of the earlier papers cited above, in particular of the 
parts relevant to the present article, is presented in the first section 
of this paper. 

1. The structure of discrete F-E games and their optimal strategies 

An F-E game is said to be discrete if the F-E is allowed to move only 
along a certain branch structure. Each branching point of the branch 
structure is said to be a choice point for the F-E. Retracing of the 
same branch is forbidden. 

If there are always two new branches at each choice point, the 
game is said to be binary. When the two new branches always converge 
to the same choice point, the branch structure is said to be a chain. 
There are several kinds of branch structure, and there is one theorem 
concerning the influence of branch structure. 

THEOREM 1: If the environment is binary and homogeneous, and the size 
of V-span is one, the optimal decision function for a well-informed F-E 
is independent of branch structure of the environment. 



1 See Toda (1963a). 



A proof of this theorem may be found in Toda (1963a); the notions of 
homogeneous, well-informed, and V-span will be explained later in this 
section. 

G4 is a binary, homogeneous, V-span 1 game. 

The part of the branch starting from the choice point and ending at 
the next choice point, but not including the choice points themselves, is 
called a path. The whole set of paths starting from the same choice point 
is said to form a unit environment. In a binary game, every unit environ- 
ment consists of two paths. 

The two kinds of substances which can exist in a unit environment are 
F(fungus) and U(uranium). They do not exist on choice points but on paths. 

An F-E game is said to be singular if not more than one U and not more 
than one F can exist in any single unit environment provided by the game. 
If a singular game is further constrained so that not more than 
one substance can exist on the same path, the game is said to be exclusive. 
G4 is a singular and exclusive game. 

If the environment is singular and exclusive, there can be only four 
types of entries in each unit environment, (F U) , (F O) , (O U) , and (O O) , 
where (F U) means that one fungus exists on one of the paths constituting 
the unit environment and one uranium exists on the other path. (F O) means 
that fungus exists on one of the paths, and nothing on the other. And so on. 

If the probability distribution over the alternative types of unit 
environment encountered at each choice point is independent of the choice 
point, the game is said to be homogeneous. G4 is homogeneous. 

The state of the F-E (the player of an F-E game) is characterized by three 
variables, i.e., the F-storage, the U-storage, and the L-storage. 
At the beginning of a play, the F-E is given certain initial amounts of 
these three variables, which may be finite or infinite. The F-storage regularly 
decreases by one unit as F-E moves from one choice point to the next. 
Whenever one F is picked up by F-E on the way, F-storage increases by 
a units. Therefore, the net gain of F-storage through the unit locomotion 
is a-1 units. When F-storage reaches zero, the play is finished. The 
U-storage increases by one unit whenever F-E takes one U. The L-storage, 
like F-storage, decreases by one unit as F-E moves from one choice point 
to the next. But, there is nothing, like fungus for F-storage, that 
increases L-storage. When L-storage reaches zero, the play of the game 
is finished. 

According to whether the initial L-storage is finite or infinite, the 
F-E game is said to be a finite game or an infinite game. G4 is a finite game. 

In G4 we also assume that the initial F-storage is finite, and that 
the initial U-storage is zero. 

When the F-E is provided with all necessary information concerning the 
structure of the game, including the probability distributions of fungus and 
uranium over the unit environments, the F-E is said to be well-informed. 
The F-E in G4 is well-informed. Besides knowing the probability distributions, 
the F-E can, in general, see the contents of the unit environments near his 
choice point. If he sees from the unit environment belonging to his 
present choice point (the immediate unit environment) up to those belonging 
to the n-th possible choice points, the F-E is said to be of V-span n. 
The F-E in G4 is of V-span 1. 

The pay-off to F-E is proportional to the U-storage at the time when 
play is finished. 



The well-informed F-E has knowledge about the defining characteristics 
of the game, himself, and the environment, in particular, the probabilities 
governing the distributions of F and U in the unit environment. We call 
this set of knowledge the permanent decision context, P. If F-E has vision, 
he has at each choice point the information about the content in terms of 
F and U of the environment covered by his V-span. We call this set of 
knowledge the external decision context, E. 

Furthermore, with the progress of the game, the internal state of the 
F-E defined by the variables like his F-storage, L-storage, and U-storage 
will change. We will call this set of knowledge about the F-E himself the 
internal decision context, I. 

A strategy, or as we may call it a decision function, is a set of rules that 
dictates the F-E's decision at each choice point, and since each choice 
point is virtually defined by the three decision contexts, P, E, and I, a 
strategy is a function of these three decision contexts, i.e., 

D = D(P, E, I) . 

The optimal strategy, or the optimal decision function of the given game, 
is the strategy that maximizes the expected future U return. If F-E is 
well-informed, the expectation is the mathematical expectation in the 
ordinary sense. 

After a game is specified, the permanent decision context is no 
longer a variable; it enters the game as a set of parameters. 

The decision contexts I and E vary from choice point to choice point. 
The value of the decision function, D, is a decision. That is, it takes the 
value F or U according to the values of the three decision contexts, and it is 
unique when I and E are fixed. If a decision function maximizes the expected 
future U return at each choice point, then it is the optimal decision function. 

We define the expected future U return function, V, as a functional 
of the decision function and the three decision contexts when the F-E 
is well-informed. 

V = V {D(P, I, E) ; P, I, E } . 

V will attain its maximum when the decision function is the optimal 
one. Thus, the problem is to find the optimal decision function which 
maximizes the expected future U return, i.e., 

(1)   $V^* = \max_{\{D\}} V(D; P, I, E)$ . 

2. IC diagram and some remarks concerning G4 

Let us briefly talk about the IC (Internal Context) diagram which was 
introduced to the F-E game in G3 (Toda, 1963a). Our problem may most 
conveniently be visualized by using this diagram. 

This diagram shows the internal state of the F-E by a point located at 
one of the intersections of the grid shown in Fig. 1. 

As the F-E travels one step in the environment, the point moves either 
upward or to the right one unit, depending upon whether he takes F or not 
on the path to the next choice point. 

If the point reaches the line on the lower right of the diagram labeled 
"starvation absorption barrier," the point is absorbed there indicating that 
the F-E dies there by starvation. 
The distance, x, to the starvation absorption barrier along a line 
parallel to the abscissa represents the F-E's F-storage, and thus x is equal 
to zero on the starvation line. Suppose an F-E starting with x F-storage 
never takes fungus at all. Then the point of his internal state keeps 
moving to the right one unit on each trial, and reaches the starvation 
line. Certainly he must die after x trials if he never takes fungus during 
the trip. So, any point located x units to the left of the starvation absorption 
line represents the internal context corresponding to the F-storage containing 
fungi sufficient for the F-E to negotiate x trials. It is easily seen in the 
diagram that taking a fungus increases F-storage a-1 units. 

Another absorption barrier in the figure is the "Dooms Day absorption 
barrier." Any F-E with finite L-storage must die after traveling n steps, 
where n is a finite integer. He may die earlier by starvation, but he must 
die by n steps regardless of the decisions he makes. So any point in this 
diagram representing the internal context n, namely the L-storage, must be 
located at a distance of n units (independently of the direction of movement) 
from this barrier. From this consideration the inclination of this barrier 
is automatically determined as shown in Fig. 1. 

The broken line in the middle of the diagram at y = 0 is called the 
critical level, and the broken line one unit below the critical level is 
called the semi-critical level. Let us define any point located y units 
below the critical level as representing the new relative internal context y. 
We call y the F-need, because the F-E is required to take at least y fungi 
to reach the D-day barrier. We will discuss y more explicitly in a 
later section. 



Thus, each different internal context will explicitly be represented by 
a point in the IC diagram. 

We define G4 as a finite, non-coexistent, V-span 1 game, and the 
environment of this game is characterized as singular, and thus binary, 
exclusive, and homogeneous. Therefore, there are four types of unit 
environments, (F U), (F O), (O U), and (O O), to which we assign 
probabilities fu, f(1-u), (1-f)u, and (1-f)(1-u), respectively, and which 
also exhaust the external decision contexts. Here, as indicated by the 
probability assignment, we assume independence between the F distribution 
and the U distribution over unit environments. The probability that F is 
found in a unit environment is f, and the probability that U is found in a 
unit environment is u. These two probabilities belong to the permanent 
decision context. The other parameter of the permanent decision context is 
a, which represents the increase in F-storage due to taking one fungus. 

The parameters u, f, and a constitute the permanent decision context. 
We have no other element of the permanent decision context, so we may express 
the decision function as 

D(z, I; f, u, a) 

where z is a random variable that takes one of the four values (F U), (F O), 
(O U), or (O O) with probability fu, f(1-u), (1-f)u, or (1-f)(1-u), respectively, 
and I represents the internal decision context. When the value of z is 
(F O), (O U), or (O O), the optimal decision is obvious, so it is sufficient 
to solve the decision function for z = (F U). 
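The rules stated so far can be put together in a short simulation of one play of G4. The following is only a minimal sketch under stated assumptions: the function names, the `decide` callback, and the use of Python's `random` module are illustrative choices and are not part of the original paper. F and U are drawn independently with probabilities f and u, and the callback is consulted only on a decision trial, z = (F U).

```python
import random


def sample_unit_environment(f, u, rng):
    """Draw one unit environment; F and U are assumed independent."""
    has_f = rng.random() < f
    has_u = rng.random() < u
    return has_f, has_u


def play_g4(x, n, a, f, u, decide, rng):
    """Play one game and return the final U-storage (the pay-off).

    x: initial F-storage, n: initial L-storage, a: F units gained per fungus.
    decide(x, n) -> 'F' or 'U' is a hypothetical policy callback, used only
    when z = (F U); the other three contexts have an obvious decision.
    """
    u_storage = 0
    while x > 0 and n > 0:
        has_f, has_u = sample_unit_environment(f, u, rng)
        if has_f and has_u:
            take = decide(x, n)      # non-trivial decision trial
        elif has_f:
            take = "F"
        elif has_u:
            take = "U"
        else:
            take = None
        x -= 1                       # one unit of F-storage per move
        n -= 1                       # one unit of L-storage per move
        if take == "F":
            x += a                   # net gain of a - 1 on this move
        elif take == "U":
            u_storage += 1
    return u_storage
```

For instance, `play_g4(5, 20, 4, 0.3, 0.4, lambda x, n: "U", random.Random(1))` plays a game with a policy that always prefers uranium on decision trials.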
Any trial in which the external decision context (F U) is given to the F-E 
is a (non-trivial) decision trial. 

In G4 there is no internal context which is relevant to the decision 
function other than x, the F-E's F-storage measured in trials, and y, the 
F-E's F-need to see the "Dooms Day absorption barrier." 

Let us give a short description of y, because we use y instead 
of n, the F-E's L-storage, for describing the internal decision contexts. 

Suppose an F-E starts this F-E game with finite L-storage n and finite 
F-storage x. Assume n > x. Then, to live until Dooms Day, the minimum 
number of F he must take is (n - x)/a. If this quotient comes out to be an 
integer, he is able to see the D-day without having any left-over F-storage. 
In this case we say that the D-day is in phase, and if not, it is out of phase. 
If the D-day is out of phase, the minimum number of F he needs to see the 
D-day is the smallest integer greater than the above quotient, and he will 
have some amount of left-over F-storage. 

Now y is defined by 

$y - \varphi = \dfrac{n - x}{a}$ , 

where y is an integer which shows the minimum number of F necessary to 
see the D-day as we described before, and $\varphi$ is the phase. It takes on 
values between 0 and 1, $0 \le \varphi \le 1$. 

The phase is not so important a factor in determining the over-all 
decision strategy, but still we cannot ignore it. Sometimes it determines 
the relative value of the last F needed to see the D-day. If $\varphi$ is very 
close to one, almost all of the contents of the last F will be left over. 
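The F-need and the phase can be computed directly from n, x, and a. The sketch below assumes integer n, x, and a, and uses exact rational arithmetic so that the in-phase case is detected without rounding error; the function name is hypothetical.

```python
from fractions import Fraction


def f_need_and_phase(n, x, a):
    """Return (y, phase): y is the smallest integer >= (n - x)/a,
    and phase = y - (n - x)/a is the fraction of the last fungus left over."""
    quotient = Fraction(n - x, a)
    y = -((x - n) // a)          # integer ceiling of (n - x) / a
    return y, y - quotient
```

With n = 10, x = 4, a = 3 the quotient is exactly 2, so the D-day is in phase; with x = 3 the quotient is 7/3, so y = 3 and two of the three F-storage units of the last fungus are left over.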






See Toda (1963a) 



The value of the phase is determined completely by the F-E's initial 
F-storage and initial L-storage, and remains unchanged throughout the rest 
of the game. 

What changes with age is y. The pair of variables (x, y) describes the 
internal decision context just as well as (x, n), and we will use the former 
in what follows. 

3. Optimal decision function 



We have already shown that the only external decision context 
relevant to the optimal decision function is z = (F U). So our problem 
is to assign the optimal decision to each point of the IC diagram, i.e., 
the decision that is effective when z = (F U) is given at that point. 

Now what are the alternative decisions? There are decision F, 
decision U, and all kinds of mixed decisions. Fortunately, however, 
according to Theorem 7 given in Toda (1963a), we need not worry about 
mixed decisions in finite games. 

Thus, our problem is to find the optimal decision at each point on 
the IC diagram, decision "F" or decision "U", which maximizes the 
expected future U return function V. 

Let us give a more explicit expression to equation (1). 
We can rewrite (1) as 

(2)   $V^*(x, y) = \max \{\, V(D_F; x, y),\ V(D_U; x, y) \,\}$ 

where $D_F$ (or $D_U$) is the strategy that is identical to D*, the optimal 
decision function, except at I = (x, y), where it dictates taking F (or U), 
if possible, whether that is optimal or not. 
If the F-E actually takes F, his internal context at the next choice point 
is I = (x + a - 1, y - 1) by definition. If z = (O U) is given, he takes U and 
proceeds to the next choice point, where his internal decision context is 
I = (x - 1, y). If z = (O O) is given, he just proceeds to I = (x - 1, y). 
Then we have the explicit expression for $V(D_F; x, y)$ as 

$V(D_F; x, y) = fu\,V^*(x + a - 1, y - 1)$ 
$\qquad + f(1 - u)\,V^*(x + a - 1, y - 1)$ 
$\qquad + (1 - f)u\,\{V^*(x - 1, y) + 1\}$ 
$\qquad + (1 - f)(1 - u)\,V^*(x - 1, y)$ 
$\quad = f\,V^*(x + a - 1, y - 1) + (1 - f)\,V^*(x - 1, y) + (1 - f)u$ . 
Analogously, for $V(D_U; x, y)$, 

$V(D_U; x, y) = fu\,\{V^*(x - 1, y) + 1\}$ 
$\qquad + (1 - f)u\,\{V^*(x - 1, y) + 1\}$ 
$\qquad + f(1 - u)\,V^*(x + a - 1, y - 1)$ 
$\qquad + (1 - f)(1 - u)\,V^*(x - 1, y)$ 
$\quad = f(1 - u)\,V^*(x + a - 1, y - 1) + (1 - f + fu)\,V^*(x - 1, y) + u$ . 

By taking the difference between these two expected U gain functions, 
we can determine which one gives the greater expectation: 

$V(D_F; x, y) - V(D_U; x, y) = fu\,\{V^*(x + a - 1, y - 1) - V^*(x - 1, y) - 1\}$ . 

So by defining a delta function $\delta(x, y)$ as follows, 

(3)   $\delta(x, y) = V^*(x + a - 1, y - 1) - V^*(x - 1, y) - 1$ , 

the value of the optimal decision function for a given internal decision 
context is specified as follows: 

(4)   $D^*(z = (F\,U);\, x, y) = \begin{cases} F & \text{if } \delta(x, y) > 0 \\ 0 & \text{if } \delta(x, y) = 0 \\ U & \text{if } \delta(x, y) < 0 , \end{cases}$ 

where D* = 0 means indifference between F and U. 
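The comparison above can be checked numerically for any pair of successor values. The sketch below assumes that the two successor values V*(x + a - 1, y - 1) and V*(x - 1, y) are already known and passed in as `v_up` and `v_down`; all function names are hypothetical and chosen only for illustration.

```python
def expected_if_f(v_up, v_down, f, u):
    """V(D_F; x, y): take F on a decision trial."""
    return f * v_up + (1 - f) * v_down + (1 - f) * u


def expected_if_u(v_up, v_down, f, u):
    """V(D_U; x, y): take U on a decision trial."""
    return f * (1 - u) * v_up + (1 - f + f * u) * v_down + u


def delta(v_up, v_down):
    """The delta function of equation (3)."""
    return v_up - v_down - 1


def optimal_decision(d):
    """Equation (4): 'F', 'U', or '0' for indifference."""
    if d > 0:
        return "F"
    if d < 0:
        return "U"
    return "0"
```

For any inputs, `expected_if_f(...) - expected_if_u(...)` should equal `f * u * delta(...)` up to rounding, which is the identity that makes the sign of the delta function sufficient for the decision.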

4. Analytical solution 

Now, just for the sake of simplicity, let us assume that the D-day 
is in phase. Then y = 0 is the only critical level. 

By definition of y, the F-E whose value of y is zero can survive up 
to the D-day with probability 1. So giving up U for F when z = (F U) is 
obviously sub-optimal. Theorem 8 given in Toda's paper states this 
explicitly as 

THEOREM 8: $D^*(z = (F\,U); x, y) = U$ if $y \le 0$ and $n > 0$. 

The expected U gain function for the critical level is directly derived from 
this theorem as 

(5)   $V^*(x, y = 0) = ux$ . 

Note that we are assuming the phase to be zero. 

Now we have Theorem 8 and equation (5) as the boundary conditions, 
and we proceed to solve the optimal decision function for y = 1. 

1 This is not too great a simplification. It can easily be shown that 
the same procedure that we are going to use to solve this problem is 
applicable to the case of an out-of-phase D-day. 

2 Toda (1963a). 
Now, some really interesting features appear in G4. For example, 
the optimal decision depends not only on y but also on x. This has never 
been the case in G1 through G3. 

By putting y = 1 in (3) we have 

$\delta(x, y = 1) = V^*(x + a - 1, y = 0) - V^*(x - 1, y = 1) - 1$ . 

What we want to solve for is that x which gives the neutral optimal 
decision, namely the value of x for which the optimal decision is 0, 
$\delta(x, y) = 0$. Let us denote this value of x as $x_1$. Then, if x is 
greater than $x_1$ the optimal decision should be U, and if x is smaller 
than $x_1$, the optimal decision should be F. Thus, this $x_1$ is the 
critical decision-shifting point on the line y = 1. This $x_1$ divides the 
line y = 1 into two parts characterized by different optimal decisions. 

By obtaining the decision-shifting point $x_2$ for y = 2, $x_3$ for y = 3, 
and so on, the whole IC diagram will be divided into two regions, namely 
the U decision region and the F decision region. 

Thus, obtaining the decision-shifting point for each value of y is 
all we need for the optimal strategy in G4. In other words, our problem 
is to solve for the x satisfying the following equation for each value of 
y, y > 0: 

$D^*\{z = (F\,U); x, y\} = 0$ , 
or 

(6)   $\delta(x, y) = 0$ . 



1 See the former report (Toda, 1963a). 

2 Though x is an integer in a discrete game like G4, here we regard x 
as a continuous variable for convenience. 

3 Actually we assume not only the uniqueness of the solution of 
$\delta(x, y) = 0$ with respect to x, but also that the optimal decision is F 
on each y line between the decision-shifting point and the starvation 
absorption line, and U on the other parts of the y line. 
Then, $x_1$ is the solution of equation (7), 

(7)   $\delta(x, y = 1) = V^*(x + a - 1, y = 0) - V^*(x - 1, y = 1) - 1 = 0$ . 

From (5) the first term of (7) is given as 

(8) V*(x + a - 1, y = 0) = u(x + a - 1) . 

If x is the neutral decision point, x - 1 surely falls into the F decision 
region. Therefore, V*(x - 1, y = 1) is the expected U gain function in the 
F decision region, which means that the optimal F-E must take F whenever 
he comes to a fungus. So we have 

(9)   $V^*(x - 1, y = 1) = u(x - 1)(1 - f)^{x-1} + u(x + a - 2)\{1 - (1 - f)^{x-1}\}$ 

where the first term shows the expected U gain under the condition that 
the F-E dies on the line y = 1, and the second term shows the expected 
U gain under the condition that the F-E takes one fungus and goes up to 
the y = 0 line. By substituting (8) and (9) into (7), we have 

$\delta(x, y = 1) = u(x + a - 2)(1 - f)^{x-1} - u(x - 1)(1 - f)^{x-1} + u - 1 = 0$ . 

Therefore, 

(10)   $(1 - f)^{x-1} = \dfrac{1 - u}{u(a - 1)}$ . 

For convenience let us regard x as if it were continuous. Then we have 

(11)   $(x - 1)\log(1 - f) = \log(1 - u) - \log u - \log(a - 1)$ . 

Therefore 

(12)   $x_1 = \dfrac{\log(1 - u) - \log u - \log(a - 1)}{\log(1 - f)} + 1$ . 

Thus, $x_1$ is obtained. 
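The closed forms for y = 1 lend themselves to a quick numerical check. The sketch below, with hypothetical function names, evaluates the decision-shifting point from (12), the delta function on y = 1 obtained by substituting (8) and (9) into (7), and compares the closed form (9) with a step-by-step evaluation that always takes F on the line y = 1. It assumes 0 < f < 1, 0 < u < 1, and a > 1.

```python
import math


def x1_continuous(f, u, a):
    """Equation (12): the continuous decision-shifting point on y = 1."""
    return (math.log(1 - u) - math.log(u) - math.log(a - 1)) / math.log(1 - f) + 1


def delta_y1(x, f, u, a):
    """delta(x, y = 1) after substituting (8) and (9) into (7)."""
    return u * (a - 1) * (1 - f) ** (x - 1) + u - 1


def v_closed_y1(m, f, u, a):
    """Equation (9) written for the argument m = x - 1."""
    q = 1 - f
    return u * m * q ** m + u * (m + a - 1) * (1 - q ** m)


def v_stepwise_y1(m, f, u, a):
    """Same quantity built one step at a time with the always-F rule:
    V(x) = f * u * (x + a - 1) + (1 - f) * V(x - 1) + (1 - f) * u."""
    v = 0.0
    for x in range(1, m + 1):
        v = f * u * (x + a - 1) + (1 - f) * v + (1 - f) * u
    return v
```

With the illustrative values f = 0.2, u = 0.3, a = 5, `x1_continuous` falls between 3 and 4, and `delta_y1` is positive at x = 3 and negative at x = 4, so the integral decision-shifting point on y = 1 would be 3 for these parameters.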

What we shall do next is to obtain the decision-shifting point 
$x_2$ for y = 2 and for higher values of y. But as one will soon see, solving 
for the decision-shifting points for higher values of y is not so easy as it 
was for y = 1. So we will just briefly portray an outline of the procedure 
for obtaining the analytical solution for $x_2$ below, and will then proceed 
to the next section, where we shall discuss the numerical method for 
obtaining the decision-shifting points for higher values of y. 

The delta function for y = 2 is given as 

(13)   $\delta(x, y = 2) = V^*(x + a - 1, y = 1) - V^*(x - 1, y = 2) - 1$ . 



Let us define the $\alpha$ areas and the $\beta$ areas in the IC diagram as 
they are illustrated in Fig. 2. 

[Fig. 2. IC diagram with the lines y = 0, y = 1, and y = 2 marked.] 
Usually the solution $x_1$ of equation (12) will not be an integer, but 
from now on, when we refer to $x_1$, we shall mean the integral part of 
the solution of (12). Then $I = (x_1, y = 1)$ is located in the F decision 
region and $I = (x_1 + 1, y = 1)$ in the U decision region. Consider the 
expected U gain function $V^*(x + a - 1, y = 1)$, which is the first term on 
the right-hand side of (13), for $x = x_2$ in the following three alternative 
cases: 

1 When the solution $x_1$ itself is an integer, $I = (x_1, y = 1)$ lies on the 
border of the F and U regions. No harm is done, however, by stipulating 
that the border itself belongs to the F decision region. 

Case 1F: The F-E at $I = (x_2 + a - 1, y = 1)$ dies on the line y = 1. 
The probability that Case 1F becomes true is 

$(1 - f)^{x_1} (1 - f + uf)^{x_2 + a - 1 - x_1}$ 

and the expected U gain given Case 1F is 

$(1 - f)^{x_1} (1 - f + uf)^{x_2 + a - 1 - x_1} \cdot u(x_2 + a - 1)$ . 

Case 2F: The F-E climbs one level up from the area $\alpha_1$ to the line 
y = 0. The probability that Case 2F becomes true is 

$\{1 - (1 - f)^{x_1}\} (1 - f + uf)^{x_2 + a - 1 - x_1}$ 

and the corresponding expected U gain is 

$\{1 - (1 - f)^{x_1}\} (1 - f + uf)^{x_2 + a - 1 - x_1} \cdot u(x_2 + 2a - 2)$ . 

Case 3F: The F-E climbs one level up from the area $\beta_1$ to the line 
y = 0. The probability that Case 3F becomes true is 

$1 - (1 - f + uf)^{x_2 + a - 1 - x_1}$ 

and the corresponding expected U gain is 

$\{1 - (1 - f + uf)^{x_2 + a - 1 - x_1}\} \cdot u(x_2 + 2a - 2)$ . 

Therefore, $V^*(x_2 + a - 1, y = 1)$ is expressed as the sum of the expected U 
gains under these three cases: 

$V^*(x_2 + a - 1, y = 1) = (1 - f)^{x_1} (1 - f + uf)^{x_2 + a - 1 - x_1} \cdot u(x_2 + a - 1)$ 
$\qquad + \{1 - (1 - f)^{x_1}\} (1 - f + uf)^{x_2 + a - 1 - x_1} \cdot u(x_2 + 2a - 2)$ 
$\qquad + \{1 - (1 - f + uf)^{x_2 + a - 1 - x_1}\} \cdot u(x_2 + 2a - 2)$ 
$\quad = u(x_2 + 2a - 2) + (1 - f)^{x_1} (1 - f + uf)^{x_2 + a - 1 - x_1} \cdot u(1 - a)$ . 

Our next step is to obtain an explicit expression for $V^*(x - 1, y = 2)$, 
the second term of (13), and this will again be done by considering the 
following alternative cases separately. It is clear that if $x_2$ is the 
decision-shifting point, $x_2 - 1$ surely falls into the F decision region. 

Case 1U: The F-E dies on the line y = 2. The probability that 
Case 1U becomes true is 

$(1 - f)^{x_2 - 1}$ 

and the expected U gain is 

$(1 - f)^{x_2 - 1} \cdot u(x_2 - 1)$ . 

Case 2U: The F-E dies on the line y = 1. This case may further be 
classified into the following subclasses: 

Case 2U$\alpha$: The F-E goes up to the line y = 1 from the area $\alpha_2$ and 
dies on that line. The probability that Case 2U$\alpha$ will happen is 

$f(1 - f)^{x_2 + a - 2} (x_2 - a + 1)$ 

and the expected U gain is 

$f(1 - f)^{x_2 + a - 2} (x_2 - a + 1) \cdot u(x_2 + a - 2)$ . 

Case 2U$\beta$: The F-E goes up to the line y = 1 from the area $\beta_2$ and 
dies on that line. The probability for Case 2U$\beta$ is 

$\frac{1}{u}(1 - f + uf)(1 - f)^{x_1} \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\}$ 

and the expected U gain is 

$\frac{1}{u}(1 - f + uf)(1 - f)^{x_1} \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\} \cdot u(x_2 + a - 2)$ . 

Case 3U: The F-E goes up to the line y = 1, and further goes up to the 
line y = 0. This case must also be subdivided. 

Case 3U$\alpha$: The F-E goes up to the line y = 1 from the area $\alpha_2$, and 
further goes up to the line y = 0 from the area $\alpha_1$. The probability for 
Case 3U$\alpha$ is 

$(1 - f)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 - 1} - f(1 - f)^{x_2 + a - 2} (x_2 - a + 1)$ 

and the expected U gain is 

$\{(1 - f)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 - 1} - f(1 - f)^{x_2 + a - 2} (x_2 - a + 1)\} \cdot u(x_2 + 2a - 3)$ . 

Case 3U$\beta$: The F-E goes up to the line y = 1 from the area $\beta_2$, and 
further goes up to the line y = 0 from the area $\beta_1$. The probability 
for Case 3U$\beta$ is 

$\{1 - (1 - f)^{x_2 + a - 2 - x_1}\} - \frac{1}{u}(1 - f + uf) \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\}$ 

and the expected U gain is 

$[\{1 - (1 - f)^{x_2 + a - 2 - x_1}\} - \frac{1}{u}(1 - f + uf) \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\}] \cdot u(x_2 + 2a - 3)$ . 

Case 3U$\beta\alpha$: The F-E goes up to the line y = 1 from the area $\beta_2$, and 
further goes up to the line y = 0 from the area $\alpha_1$. The probability 
for Case 3U$\beta\alpha$ is 

$\frac{1}{u}(1 - f + uf) \{1 - (1 - f)^{x_1}\} \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\}$ 

and the expected U gain is 

$\frac{1}{u}(1 - f + uf) \{1 - (1 - f)^{x_1}\} \{(1 - f + uf)^{x_2 + a - 2 - x_1} - (1 - f)^{x_2 + a - 2 - x_1}\} \cdot u(x_2 + 2a - 3)$ . 

By summing up all these expected U gains we have an explicit 
expression for $V^*(x_2 - 1, y = 2)$: 

(15)   $V^*(x_2 - 1, y = 2) = u(x_2 + 2a - 3) - 2u(a - 1)(1 - f)^{x_2 - 1}$ 
$\qquad - (a - 1)(1 - f)^{x_1} (1 - f + uf)^{x_2 + a - 1 - x_1}$ 
$\qquad + (a - 1)(1 - f)^{x_2 + a - 1}$ . 

Thus, finally, the delta function for y = 2 is expressed as 

(16)   $\delta(x_2, y = 2) = V^*(x_2 + a - 1, y = 1) - V^*(x_2 - 1, y = 2) - 1$ 
$\quad = (a - 1)(1 - f)^{x_1} (1 - u)(1 - f + uf)^{x_2 + a - 1 - x_1}$ 
$\qquad + (a - 1)(1 - f)^{x_2 - 1} \{2u - (1 - f)^{a}\} - (1 - u)$ . 
The solution $x_2$ is therefore obtained as the maximum integer which makes 
$\delta(x_2, y = 2)$ given in (16) non-negative. The tedious procedure thus 
portrayed for obtaining $x_2$ is certainly enough to discourage us from 
proceeding any further to seek the solution for y = 3 and higher values. 
In the next section, therefore, we turn our attention to numerical methods 
which will give us the decision-shifting points of higher orders for a given 
set of parameters (permanent decision context) with the aid of a computer. 
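Because the closed forms for y = 2 are long, it may help to check mechanically that (16) is the difference of the two expected U gain functions minus one. The sketch below is only a transcription check of the formulas as read here, not an independent verification of the derivation; the function names and the sample parameter values are hypothetical, and x1 is taken as the integral decision-shifting point on y = 1.

```python
def v_upper(x2, x1, f, u, a):
    """V*(x2 + a - 1, y = 1): the sum over Cases 1F to 3F."""
    q, r = 1 - f, 1 - f + u * f
    return u * (x2 + 2 * a - 2) + q ** x1 * r ** (x2 + a - 1 - x1) * u * (1 - a)


def v_lower(x2, x1, f, u, a):
    """V*(x2 - 1, y = 2) as given in (15)."""
    q, r = 1 - f, 1 - f + u * f
    return (u * (x2 + 2 * a - 3)
            - 2 * u * (a - 1) * q ** (x2 - 1)
            - (a - 1) * q ** x1 * r ** (x2 + a - 1 - x1)
            + (a - 1) * q ** (x2 + a - 1))


def delta_y2(x2, x1, f, u, a):
    """delta(x2, y = 2) as given in (16)."""
    q, r = 1 - f, 1 - f + u * f
    return ((a - 1) * q ** x1 * (1 - u) * r ** (x2 + a - 1 - x1)
            + (a - 1) * q ** (x2 - 1) * (2 * u - q ** a)
            - (1 - u))
```

For any admissible inputs, `delta_y2(...)` should agree with `v_upper(...) - v_lower(...) - 1` up to rounding.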

5. The recurrence relation of the expected U gain functions 

The computational procedure of our numerical method is most 
conveniently described in terms of the recurrence relation of the 
expected U gain functions, expressed as follows: 

(17)   $V^*(x, y) = t^*(x, y)\,V^*(x + a - 1, y - 1) + \bar{t}^*(x, y)\,V^*(x - 1, y) + U^*(x, y)$ 

where $t^*(x, y)$ is the probability of taking fungus under the given optimal 
strategy for the internal decision context I = (x, y), and 
$\bar{t}^*(x, y) = 1 - t^*(x, y)$ is the complement of $t^*(x, y)$. 
$U^*(x, y)$ is the expected U gain under the same 
strategy for the unit travel from I = (x, y) to I = (x - 1, y). 

When the F-E is in the F decision region, he will take fungus whichever 
of the external decision contexts, z = (F O) or z = (F U), occurs. 
Therefore, t*(x, y) is exactly f as long as the F-E is in the F decision 
region, and he proceeds with probability f to the next choice point, where 
his internal decision context is I = (x + a - 1, y - 1). Furthermore, 
if no external decision context containing fungus occurs when the F-E's 
internal decision context is I = (x, y), he just 
proceeds one step toward the starvation line, where his internal decision 
context is I = (x - 1, y), and he may luckily pick up a piece of uranium 
with probability u(1 - f), which is the probability of z taking the value 
z = (O U). Therefore, U*(x, y) is equal to u(1 - f) when the F-E is in the 
F decision region. 

Analogously, when the F-E is in the U decision region, he will take 
fungus only if the external decision context z = (F O) is given, and, 
therefore, t*(x, y) is equal to f(1 - u). Also, U*(x, y) is equal to u 
when the F-E is in the U decision region. Now, the foregoing can be 
abbreviated as follows: 



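(18)   $t^*(x, y) = \begin{cases} f & \text{if } I = (x, y) \text{ is in the F decision region} \\ f(1 - u) & \text{if } I = (x, y) \text{ is in the U decision region} \end{cases}$ 

(19)   $U^*(x, y) = \begin{cases} u(1 - f) & \text{if } I = (x, y) \text{ is in the F decision region} \\ u & \text{if } I = (x, y) \text{ is in the U decision region} \end{cases}$ 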



6. A numerical method for obtaining the optimal solution 

By virtue of Theorem 8, the optimal decision on the critical level, 
namely when y = 0, is always the decision U. Therefore, the expected U 
gain for I = (x, y = 0) is a linear function of x, as given in (5). To restate: 

(5)   $V^*(x, y = 0) = ux$ , 

where the phase is assumed to be zero. 

On each absorption barrier, there is no further U return. This yields 
the two boundary conditions, namely 

(20)   $V^*(n = 0) = 0$ , 

(21)   $V^*(x = 0, y) = 0$ . 

1 Toda, op. cit. 

Since the optimal decisions and the expected U gain function under the 
optimal strategy for y = 0 are already given, let us proceed to the case 
y = 1. 

When I = (1, 1), the delta function is expressed as 

$\delta(1, 1) = V^*(a, 0) - V^*(0, 1) - 1$ . 

By the boundary conditions and equation (5), the above equation can be 
rewritten as 

$\delta(1, 1) = ua - 1$ . 

Therefore, if $ua - 1 \ge 0$, the optimal decision is F and I = (1, 1) belongs 
to the F decision region. Thus, by putting $t^*(1, 1) = f$ and 
$U^*(1, 1) = u(1 - f)$, we have from (17) 

$V^*(1, 1) = f\,V^*(a, 0) + (1 - f)\,V^*(0, 1) + U^*(1, 1) = fua + u(1 - f)$ , 

and if $ua - 1 < 0$, I = (1, 1) belongs to the U decision region and the 
expected U gain function is of the form 

$V^*(1, 1) = f(1 - u)\,V^*(a, 0) + (1 - f + uf)\,V^*(0, 1) + U^*(1, 1) = f(1 - u)\,ua + u$ . 

Then we proceed to the point I = (2, 1), where the delta function 
and the expected U gain function are written as 

$\delta(2, 1) = V^*(a + 1, 0) - V^*(1, 1) - 1$ , 

$V^*(2, 1) = t^*(2, 1)\,V^*(a + 1, 0) + \bar{t}^*(2, 1)\,V^*(1, 1) + U^*(2, 1)$ . 

1 Here we include the neutral decision point within the F decision region. 
Since we have already calculated V*(a + 1, 0) and V*(1, 1), the sign 
of $\delta(2, 1)$ is easily obtained, and so is the U gain V*(2, 1). 

Apply the same technique successively to $\delta(3, 1)$, V*(3, 1), 
$\delta(4, 1)$, and so on, until we cover some reasonably large range of x 
for y = 1. Then proceed to y = 2, applying the same technique. Thus, 
we will eventually cover a desired range of x and y within which the 
F and U regions and the values of V* are computed for the given 
set of parameters, u, f, and a. 

Usually, for each fixed value of y except y = 0, the sign of the delta 
function is positive for relatively small values of x; it then 
turns negative after passing the decision-shifting point and never 
becomes positive again. In other words, $\delta(x, y)$ seems to be a monotone 
decreasing function of x for any y. Although this hypothesis has not 
been proved, we are convinced of its truth and have used it to simplify our 
computer program described later. 
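The row-by-row procedure just described can be sketched in a few lines. This is not the authors' PDP-1 program; it is a minimal illustration that assumes an in-phase D-day and an integer a of at least 2, and the function names and table layout are hypothetical. It uses the boundary conditions (5) and (21), the delta function (3), and the recurrence (17) with (18) and (19), counting the neutral point as part of the F decision region. Unlike the authors' program, it does not rely on the monotonicity hypothesis: the sign of the delta function is evaluated at every point.

```python
def solve_g4(f, u, a, x_max, y_max):
    """Tabulate V*(x, y) and the decision for z = (F U).

    Returns (V, D): V[y][x] is the expected U gain and D[y][x] is 'F' or 'U'
    (None on the starvation barrier x = 0). Rows for smaller y are made wider
    so that V*(x + a - 1, y - 1) is always available.
    """
    V, D = [], []
    for y in range(y_max + 1):
        width = x_max + (a - 1) * (y_max - y)
        row_v = [0.0] * (width + 1)              # (21): V*(0, y) = 0
        row_d = [None] * (width + 1)
        for x in range(1, width + 1):
            if y == 0:
                row_v[x] = u * x                 # (5): critical level
                row_d[x] = "U"
                continue
            v_up = V[y - 1][x + a - 1]
            v_down = row_v[x - 1]
            delta = v_up - v_down - 1            # (3)
            in_f = delta >= 0                    # border counted as F region
            t = f if in_f else f * (1 - u)       # (18)
            gain = u * (1 - f) if in_f else u    # (19)
            row_v[x] = t * v_up + (1 - t) * v_down + gain   # (17)
            row_d[x] = "F" if in_f else "U"
        V.append(row_v)
        D.append(row_d)
    return V, D


def shifting_points(D):
    """Largest tabulated x with decision F on each line y = 1, 2, ...
    (0 if the F decision region is empty within the tabulated range)."""
    return [max((x for x, d in enumerate(row) if d == "F"), default=0)
            for row in D[1:]]
```

For example, `V, D = solve_g4(0.2, 0.3, 5, 6, 1)` followed by `shifting_points(D)` gives the integral decision-shifting point on y = 1 for those illustrative parameter values, which can be compared with the integral part of the solution of (12).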

7. Some examples of actual computation 

A DEC PDP-1 computer was used for our computation. Fig. 3 shows 
an example of the flow chart for the computation. Some examples of the 
results of computations for the delta function are shown in Fig. 4. The 
dotted line of each graph represents the border between the F decision region 
and the U decision region. In Fig. 5 are presented the values of V* plotted 
against the internal decision context x for several values of y . 

8. A pilot experiment for G4 

In this section, we report an experiment in which G4 was played by 
human subjects. The subjects were the staff's wives and secretaries of 
the Institute for Research, State College, Pennsylvania. All the subjects 
were gathered in a single room and taught the rules of G4, including 
the values of the parameters, u, f, and a, which differed from 
session to session. In each session an IC diagram sheet, on which were 
drawn the starvation absorption barrier and the Dooms Day barrier, was 
given to each subject. 

Before the presentation of the external decision context, a
"predecision" was required on each trial. Here a "predecision" means
the decision made prior to the presentation of the actual external
decision context, supposing that context to be z = (F U); namely, the
decision the subject will make if the external decision context turns
out to be the non-trivial one. Ss were also required to make
predecisions for as many future choice points as possible, that is,
for the choice points at which they might find themselves later.

When the predecisions had been made, the experimenter threw two dice,
one (twenty-sided) for the probability f and the other (ten-sided) for
the probability u, and announced the external decision context realized
on that trial. If the trial turned out to be a decision trial, the
actual decision made after the announcement was required to be the
same as the predecision. If it turned out not to be a decision trial,
Ss were not bound by their predecisions. They could take one uranium
or one fungus according to the realized external decision context, or
simply proceed one step toward the starvation line if the external
decision context was z = (O O).
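
The dice procedure can be mimicked as follows. Since the text does not
say how die faces were mapped to outcomes, the sketch assumes that a
fungus appears when the twenty-sided die shows at most 20f and that a
uranium appears when the ten-sided die shows at most 10u. The function
names external_context and is_decision_trial are likewise hypothetical.

```python
def external_context(d20, d10, f, u):
    """Map one throw of the two dice to an external decision context z.

    d20 is the face (1-20) of the twenty-sided die and d10 the face
    (1-10) of the ten-sided die. The face-to-outcome mapping is an
    assumption, not a detail reported in the paper.
    """
    fungus = "F" if d20 <= round(20 * f) else "O"
    uranium = "U" if d10 <= round(10 * u) else "O"
    return (fungus, uranium)


def is_decision_trial(z):
    # Only z = (F U) forces a genuine choice between fungus and uranium.
    return z == ("F", "U")
```

With f = .4 and u = .5, faces 1 to 8 of the twenty-sided die and faces
1 to 5 of the ten-sided die would count as successes under this
assumption.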

Depending upon the actual decision made, Ss moved to the next choice
point on their IC diagram sheet and recorded their new predecisions
there on the sheets. The subjects were asked to try to maximize their
U return at the end of each session. They were paid a quarter
for each uranium they collected.

The values of the parameters and the results are shown on the
following pages. Traces of Ss' actual locomotion are plotted on the IC
diagrams, and the corresponding optimal decision shifting lines are
given so that Ss' decisions can be compared with the optimal decisions.
A double line indicates that the trial between the two choice points
was a decision trial.

Each circle above the locomotion line shows that S gained a piece of
uranium there. Subjects' predecisions were also plotted on the IC
diagram, but we show only the predecision made for the immediately
following trial. The letters F and U on the locomotion lines are
predecisions made by Ss.
Values of parameters (u = .5, f = .4, a = 3); starting conditions (x = 9, y = 7, L = 31)

9. Conclusion

We used a digital computer to solve the optimal decision problem
of the F-E game, even though G4 can hardly be regarded as a very
complex game. From this we anticipate that it will become more and
more difficult to obtain the optimal solution analytically in more
elaborate F-E games. Therefore, our primary concern will be to develop
computer programs that give the optimal solution of each game.

As for the results of the experiment, we are in no position to discuss
them in detail, because the data are insufficient to uncover the
decision strategies actually employed by Ss. For that purpose we would
need to repeat the same experiment many times with the same parameter
values, so that the locomotion traces of each subject cover a fairly
large part of the IC diagram.

One thing we may be able to comment on is the type of strategy
used by most of the subjects in most of the games. It is the type of
strategy one of the authors called the critical x-value strategy or the
economist's strategy. Every subject seems to be trying to follow his
own decision-shifting line. The critical x-value seems to vary from
subject to subject, and also with the values of the parameters. At
least, therefore, we may say that the observed strategies were not
very far from optimal, though contaminated by each individual's
characteristic biases.



1 Toda, op. cit.